Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR
Authors
Abstract
On the AURORA-2 task, good results at low SNR levels have been obtained with a system that uses state posterior estimates provided by an exemplar-based sparse classification (SC) system. At the same time, posterior estimates obtained with a multilayer perceptron (MLP) yield good results at high SNRs. In this paper, we investigate the effect of combining the estimates from the SC and MLP systems at the probability level. More precisely, the probabilities are combined by a sum or a product rule using static weights and inverse-entropy-based dynamic weights. In addition, we investigate a modified dynamic weighting approach that enhances the contribution of the SC stream, using the static weights and the average dynamic weights obtained on cross-validation data. Our study on the AURORA-2 task shows that, in all conditions, the modified dynamic weighting approach yields a dual-input system that performs at least as well as the best stand-alone system.
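The probability-level combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `EPS` floor, the renormalization after combining, and the toy posteriors are all assumptions made for illustration. The inverse-entropy weights follow the common formulation in which each stream's weight is proportional to the reciprocal of its per-frame posterior entropy. The modified dynamic weighting is not sketched, because the abstract does not specify it.

```python
import numpy as np

EPS = 1e-12  # assumed floor to avoid log(0) and division by zero


def entropy(post):
    """Per-frame entropy of a (frames, states) posterior matrix."""
    p = np.clip(post, EPS, 1.0)
    return -np.sum(p * np.log(p), axis=1)


def inverse_entropy_weights(post_a, post_b):
    """Dynamic per-frame weights: the more confident (lower-entropy) stream gets more weight."""
    inv_a = 1.0 / (entropy(post_a) + EPS)
    inv_b = 1.0 / (entropy(post_b) + EPS)
    w_a = inv_a / (inv_a + inv_b)
    return w_a, 1.0 - w_a


def combine(post_a, post_b, w_a, rule="sum"):
    """Combine two posterior streams; w_a is a scalar (static) or per-frame (dynamic) weight."""
    w_a = np.asarray(w_a, dtype=float)
    if w_a.ndim == 1:
        w_a = w_a[:, None]  # broadcast per-frame weights over states
    w_b = 1.0 - w_a
    if rule == "sum":
        out = w_a * post_a + w_b * post_b
    elif rule == "product":
        log_a = np.log(np.clip(post_a, EPS, 1.0))
        log_b = np.log(np.clip(post_b, EPS, 1.0))
        out = np.exp(w_a * log_a + w_b * log_b)  # weighted geometric mean
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Renormalize so every frame is again a probability distribution.
    return out / out.sum(axis=1, keepdims=True)


# Toy posteriors (2 frames, 3 states); the values are made up.
sc = np.array([[0.7, 0.2, 0.1], [0.4, 0.3, 0.3]])
mlp = np.array([[0.5, 0.3, 0.2], [0.9, 0.05, 0.05]])

w_sc, w_mlp = inverse_entropy_weights(sc, mlp)
static_sum = combine(sc, mlp, 0.5, rule="sum")
dynamic_sum = combine(sc, mlp, w_sc, rule="sum")
dynamic_prod = combine(sc, mlp, w_sc, rule="product")
```

In this toy example the SC stream is the more confident one in the first frame and the MLP stream in the second, so the dynamic weights shift between the streams frame by frame, whereas a static weight treats every frame alike.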
Similar resources
Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR
In this paper we apply the Full Combination (FC) multi-band approach, which was originally introduced in the framework of posterior-based HMM/ANN (Hidden Markov Model/Artificial Neural Network) hybrid systems, to systems in which the ANN (or Multilayer Perceptron (MLP)) is itself replaced by a Multi Gaussian HMM (MGM). Both systems represent the most widely used statistical models for robu...
Bioinspired sparse spectro-temporal representation of speech for robust classification
In this work, a first approach to a robust phoneme recognition task by means of a biologically-inspired feature extraction method is presented. The proposed technique provides an approximation to the speech signal representation at the auditory cortical level. It is based on an optimal dictionary of atoms, estimated from auditory spectrograms, and the Matching Pursuit algorithm to approximate t...
Robust ASR in Reverberant Environments Using Temporal Cepstrum Smoothing for Speech Enhancement and an Amplitude Modulation Filterbank for Feature Extraction
This paper presents techniques aiming at improving automatic speech recognition (ASR) in single channel scenarios in the context of the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge. System improvements range from speech enhancement over robust feature extraction to model adaptation and word-based integration of multiple classifiers. The selective temporal cepstrum ...
Spectral Entropy Feature in Multi-Stream for Robust ASR
In recent papers, entropy computed from sub-bands of the spectrum was used as a feature for automatic speech recognition. In the present paper, we further study the sub-band spectral entropy features which can give the flatness/peakiness of the sub-band spectrum and in turn the position of the formants in the spectrum. The sub-band spectral entropy features are used in hybrid hidden Markov mode...
Feature transformations and combinations for improving ASR performance
In this work, linear and nonlinear feature transformations have been tested in the ASR front end. Unsupervised transformations were based on principal component analysis and independent component analysis. Discriminative transformations were based on linear discriminant analysis and multilayer perceptron networks. The acoustic models were trained using a subset of HUB5 training data and they ...
Journal title:
Volume, issue
Pages -
Publication date: 2012